EXECUTIVE SUMMARY


This  report  summarizes  the  literature  reviewed  in  preparation  for  planning  and 
executing  a  series  of  controlled,  operator-in-the-loop  (OITL)  experiments  to  determine 
how  an  air  and  missile  defense  (AMD)  battle  manager’s  performance  degrades  with 
increased  workload  and  how  automated  battle  management  aids  (ABMA)  can  moderate 
this  degradation.  The  sources  for  this  survey  range  from  studies  that  describe  the  basic 
limits  of  human  memory  capacity  to  those  that  assess  the  number  of  battle  managers 
needed  to  operate  a  partially  automated  missile  defense  system. 

The research indicates that without the assistance of automation, a battle manager’s
performance will degrade as the complexity of the task increases, in particular
when he is tasked with attending to more than seven entities or decisions. Battle
managers’ performance may, however, vary considerably across experience levels and tasks.
Prominent factors that affect the overall human-system performance include the battle
manager’s cognitive capacity and the system’s level of automation.

This report outlines four different stages and eight different levels at which
automation can enhance system and human performance. An abundance of research indicates
that while automation may decrease operator workload, it may also decrease operator
activity, engagement, and attention, which could lead to a decrease in situational
awareness and performance. There is no shortage of research showing how overreliance on
automation results in fatal accidents when the automated system fails.
THE EFFECTS OF AUTOMATION ON
BATTLE MANAGER WORKLOAD AND PERFORMANCE


“People are flexible but inconsistent ... machines are consistent but inflexible”

(Army  Research  Laboratory,  2005) 


A.  INTRODUCTION

As the U.S. air and missile defense (AMD) communities work toward a joint
integrated AMD solution that includes a single integrated air picture (SIAP), integrated
fire control, and automated battle management aids (ABMAs), the complexity of the
system and time-sensitive threat environment will increase the cognitive demands placed
on human battle managers. The future integrated environment will require cooperation
among multiple Services and platforms (e.g., Patriot, Aegis, Theater High Altitude Area
Defense (THAAD), fighter aircraft, Joint Land Attack Cruise Missile Defense (LACMD)
Elevated Netted Sensor System (JLENS), Airborne Warning and Control System
(AWACS), E-2) to counter enemy air threats while ensuring the safe operation of
friendly aircraft. As the number, variety, and capabilities of air and missile defense
platforms increase, the ABMA’s role in assisting the joint forces battle manager in tactical
decision-making will become increasingly important. An ABMA that assists the battle
manager by executing the best possible set of decision-making tasks at just the right time
and to the most appropriate extent will optimize the overall human-system performance.
To this end, the Institute for Defense Analyses (IDA) has been tasked to support the Joint
Theater Air and Missile Defense Organization (JTAMDO) in determining the
requirements for an ABMA. More specifically, IDA has agreed to design, plan, and execute a
series of controlled, operator-in-the-loop (OITL) experiments to determine how an AMD
battle manager’s performance degrades with increased workload in the context of various
realistic scenarios and how an ABMA can mitigate this degradation. This report
summarizes the literature reviewed in preparation for planning and executing these OITL
experiments. The insights from this literature survey and the planned experiments could
be used to guide the development of a prototype ABMA. This prototype could then be
deployed and studied in a realistic environment, such as the Virtual Warfare Center
(VWC),¹ for verification and validation.

The objective of this literature review was to ensure that this research is novel and
has not already been fully investigated, to validate the need to perform the planned
experiments, and to gain insight into factors that may be important to consider while
planning the OITL experiments and analyses. The OITL experiments will be designed to
test the performance of battle managers under different workload constraints. These
constraints will be simulated by increasing or decreasing the complexity of the experiment
scenarios. The experimental design will also include conditions that alter the level of
ABMA decision-support for each scenario (an example is outlined in Section C of this
paper). The experimental participants are expected to be AMD battle managers who are
responsible for managing (but not operating) theater missile batteries and groups of
aircraft.

During the planned OITL experiments, AMD battle managers will be asked to
make decisions in the context of realistic, simulated, theater-based threat scenarios. The
level of difficulty and the level of decision support that the ABMA provides for each
scenario will vary. The battle manager’s performance will be assessed by measuring his
ability to maintain situational awareness and make appropriate, effective, and timely
battle management decisions. The decision process can be broken down into five steps,
which are required to address a given threat scenario at a particular time: (1) identify the
targets for sensor-shooter pairing, (2) establish the priority order for making pairing
decisions, (3) determine sensor-shooter pairings, (4) assess which pairings meet the
acceptability criteria, and (5) provide an ordered list of recommended pairings. The ABMA can
take over any combination of these five steps; however, in all cases, the human battle
manager must make the final sensor-shooter engagement decision.
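The phrase "any combination of these five steps" can be made concrete with a short sketch. The Python below is illustrative only: the step labels paraphrase the list above, and the function names (abma_configurations, human_steps) are hypothetical, not part of any planned experiment software. It enumerates every subset of steps that an ABMA could take over and shows that the final engagement decision always stays with the human.

```python
from itertools import combinations

# The five decision steps named in the text, in order (labels paraphrased).
STEPS = (
    "identify targets for sensor-shooter pairing",
    "establish priority order for pairing decisions",
    "determine sensor-shooter pairings",
    "assess which pairings meet acceptability criteria",
    "provide ordered list of recommended pairings",
)

FINAL_DECISION = "final engagement decision"


def abma_configurations(steps=STEPS):
    """Return every subset of step numbers (1-based) the ABMA could take over."""
    numbers = range(1, len(steps) + 1)
    configs = []
    for k in range(len(steps) + 1):
        for subset in combinations(numbers, k):
            configs.append(frozenset(subset))
    return configs


def human_steps(automated, steps=STEPS):
    """Steps left to the battle manager for a given automation configuration.

    The final engagement decision is always appended, because the text
    requires the human to make it in all cases.
    """
    remaining = [i for i in range(1, len(steps) + 1) if i not in automated]
    return remaining + [FINAL_DECISION]


configs = abma_configurations()
print(len(configs))                      # number of possible configurations
print(human_steps(frozenset({1, 2, 3})))  # steps left when ABMA handles 1-3
```

With five steps there are 2^5 = 32 possible configurations, ranging from no automation to full automation of all five steps, so an experimental design would likely need to sample only a few of them.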

This  survey  reviewed  over  50  sources,  primarily  articles,  books,  and  technical 
reports,  related  to  the  effects  of  automation  on  battle  manager  workload  and  performance 
in  AMD-related  domains.  The  sources  range  from  those  that  describe  the  basic  limits  of 
human  memory  capacity  to  those  that  assess  the  number  of  battle  managers  needed  to 
operate  a  partially  automated  missile  defense  system.  They  include  studies  within  the 
AMD domain and across other similar domains, such as air traffic control. Because the
term battle manager is specific to AMD-related domains, it will be used throughout this
report to refer exclusively to an air and/or missile defense battle manager. Otherwise, the
more generic and familiar term “operator” is used. Both terms refer to the human who is
controlling or managing the system under discussion.

¹ The Boeing VWC is a multioperator, realistic, human-in-the-loop air and missile defense test bed. A
typical VWC setup includes manned fighter aircraft, manned airborne sensors and surface-shooter
platforms, and a sizable enemy raid consisting of a mix of missiles and manned aircraft.

The  following  key  questions  were  used  to  guide  this  review  of  the  research  and  to 
organize  the  findings  in  this  paper: 

1.  Without  automation  assistance,  how  many  decisions  can  an  operator  handle 
per  unit  time?  At  what  point  does  operator  performance  drop  off,  and  does  it 
drop  off  gradually  or  abruptly? 

2.  Under  what  circumstances  will  automation  improve  operator  performance 
and  optimize  operator  workload? 

3.  Under  what  circumstances  might  automation  decrease  operator  performance 
and  situational  awareness  while  still  optimizing  operator  workload? 

None  of  the  sources  surveyed  has  fully  investigated  these  questions  in  an  AMD 
environment;  however,  many  of  the  studies  contain  important  implications  for  the 
planned  OITL  experiments.  The  next  three  sections  summarize  the  literature  that 
addresses  each  of  the  three  questions.  Section  E  summarizes  some  of  the  key  historical 
findings. 

B.  OPERATOR  PERFORMANCE  WITHOUT  AUTOMATION  ASSISTANCE 

This section reviews literature related to Question 1:

Without automation assistance, how many decisions can an operator handle
per unit time? At what point does operator performance drop off, and
does it drop off gradually or abruptly?

The  number  of  decisions  that  a  battle  manager  can  handle  is  largely  determined 
by  his  workload.  This  section  introduces  and  defines  the  concept  of  operator  workload 
and  then  discusses  the  impact  of  task  complexity  and  battle  managers’  behavioral  factors 
on  operator  workload. 

1.  Understanding Battle Manager Workload

Operator workload (or simply “workload”) is a human factor that describes the
cognitive effort involved in performing a task. Understanding workload means
understanding at what point, how, and to what degree the demands of the task or situation
exceed the operator’s available cognitive resources. Workload varies across operators
and tasks. This variation stems from the number and complexity of the task(s), the ability,
experience, and behavior of the operator, and the operational techniques, tactics, and
procedures that are available and applicable. While consideration of these factors may seem
straightforward, the interaction among them may produce nonlinear combinations of
complex situations that result in uncertain outcomes. As such, similarly trained and
experienced operators may respond differently to the same situations (Hilburn, 2004).

In the air traffic control domain, operator workload has traditionally been
measured using subjective assessment instruments such as the National Aeronautics and Space
Administration (NASA) Task Load Index (TLX) (Hart & Staveland, 1988), the
Subjective Workload Assessment Technique (SWAT) (Reid & Nygren, 1988), and the
Workload Profile (Tsang & Velazquez, 1996). Rubio, Diaz, Martin, & Puente (2004) compare
these three subjective assessment instruments, all of which involve asking the operator to
self-assess his workload by considering factors such as stress level, effort, and mental
demand.

In analyzing the data from our planned OITL experiments, we will take a different
approach. We will calculate workload objectively as a combination of contributions
not only from subjective workload measurement instruments, but also from objective
performance metrics and task complexity. In particular, workload will be measured
post-hoc by computationally estimating the quantity and complexity of decisions that
the AMD battle manager has actually made for a given scenario (at a particular
time, t). It can only be calculated after the battle manager has completed the task
(up to time t).

Table 1 shows all the factors that will be considered and their corresponding
assessment metrics. The human factors are listed in the top half of the table, while the
task-based factors are listed in the lower half of the table. Task-based factors are largely
derived from the complexity of the scenario. Task complexity (described in the next
section) will be calculated by considering the Inherent Task Complexity and the Actual Task
Complexity (the last two rows of Table 1). The Inherent Task Complexity is derived from
the scenario (see Section B.2 for the theoretical foundation and examples of this
concept). It is strictly a task-related factor, independent of operator performance, that
describes the complexity of the scenario at any given time. It accounts for the number of
decisions involved in the scenario and the difficulty (or complexity) of each decision.
Both static and dynamic scenario-based elements contribute to the inherent task
complexity in AMD. The ordering of scenario events also affects the inherent task complexity
(Adelman, Bresnick, Christian, Gualtieri, & Minionis, 1997).

The workload analysis for the planned
OITL experiments will consider not
only subjective workload measurement
instruments, but also objective
performance metrics and task
complexity.

Table 1.  The metrics that will be used to quantify human and task-based
factors that contribute to workload

Category            Factor                    Metric

Human Factors       Experience                Demographic questionnaire

                    Stress level              NASA TLX
                                              SWAT Assessment
                                              Observer reports

                    Confidence                Logged performance data
                                              SWAT Assessment

                    Attention                 Observer reports
                                              SWAT Assessment

                    Individual differences    Strategies and skills applied during the
                                              scenario and gathered from logged
                                              performance data, observer reports, and
                                              After Action Reviews (AARs)
                                              Demographic questionnaire

                    Performance               Logged performance data
                                              # of shots fired/missiles launched
                                              # of hits/# of misses, # of leakers,
                                              # of impacts
                                              Effectiveness
                                              Speed
                                              Efficiency
                                              Commonality

Task-Based Factors  Inherent Task Complexity  Scenario:
                                              Raid size
                                              Blue force laydown
                                              Red force laydown
                                              Routing
                                              Defended assets
                                              Timing
                                              Order of events

                    Actual Task Complexity    A measure of the Inherent Task Complexity
                                              reduced by the number and complexity of
                                              the tasks performed by the ABMA

The metrics, shown on the right-hand side of the table, include both standard
assessment instruments (described earlier in this section) and variables that would
typically be computed and logged in an AMD simulation. For example, variables that affect
the Inherent and the Actual Task Complexity include the raid size, blue force laydown,
routing, defended assets, and timing. None of these variables change across operators.

The  Actual  Task  Complexity  is  the  Inherent  Task  Complexity  reduced  by  the 
number  and  complexity  of  the  tasks  performed  by  the  ABMA.  If  no  ABMA  is  present, 
the  Actual  and  Inherent  Task  Complexity  values  are  the  same. 
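The relationship between the two complexity measures can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis method of the planned experiments: the report does not specify how "reduced by" is computed, so the sketch assumes that each decision carries a numeric difficulty weight and that complexity is the sum of those weights. The Decision class, its fields, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """One decision required by the scenario (hypothetical representation)."""
    name: str
    difficulty: float              # assumed numeric difficulty weight
    handled_by_abma: bool = False  # True if the ABMA performs this task


def inherent_complexity(decisions):
    """Scenario complexity, independent of who performs each decision."""
    return sum(d.difficulty for d in decisions)


def actual_complexity(decisions):
    """Inherent complexity reduced by the decisions the ABMA performs."""
    return sum(d.difficulty for d in decisions if not d.handled_by_abma)


# Assumed example values, chosen only for illustration.
no_abma = [
    Decision("identify targets", 2.0),
    Decision("prioritize pairings", 1.0),
    Decision("determine pairings", 3.0),
]
with_abma = [
    Decision("identify targets", 2.0, handled_by_abma=True),
    Decision("prioritize pairings", 1.0),
    Decision("determine pairings", 3.0),
]

print(inherent_complexity(no_abma), actual_complexity(no_abma))
print(inherent_complexity(with_abma), actual_complexity(with_abma))
```

Under this assumption the two values coincide when no decision is marked as handled by the ABMA, which matches the statement above, and the inherent value is unaffected by the automation setting.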

By combining the human and task-based factors in Table 1, estimates of workload
can be calculated per unit time, accounting for changes in task complexity and other
factors that change across a scenario. This notion of workload accounts for the decision
density (after-action recounting of the number of decisions the operator made per unit time)
and the degree of difficulty of each decision for each operator. The next four sections
(B.2-B.5) explain why the literature suggests that different operators will experience
different workloads for the same scenario and how factors such as the operators’ level of
experience, stress, confidence, and other human factors will influence their performance.
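As a rough illustration of decision density, the sketch below bins a post-hoc decision log into fixed time windows and reports, for each window, how many decisions the operator made and their summed difficulty. The log format (time in seconds, difficulty weight), the 60-second window, and the function name workload_per_window are assumptions made for this example; the report does not define them.

```python
from collections import defaultdict


def workload_per_window(decision_log, window_s=60.0):
    """Summarize a decision log per time window.

    decision_log: iterable of (time_s, difficulty) pairs recorded after the
    scenario (assumed format). Returns a dict mapping window index to the
    count of decisions and their summed difficulty in that window.
    """
    totals = defaultdict(lambda: {"count": 0, "difficulty": 0.0})
    for time_s, difficulty in decision_log:
        index = int(time_s // window_s)  # which window this decision falls in
        totals[index]["count"] += 1
        totals[index]["difficulty"] += difficulty
    return {index: totals[index] for index in sorted(totals)}


# Assumed example log: three decisions over the first two minutes.
log = [(5, 1.0), (30, 2.0), (65, 1.5)]
summary = workload_per_window(log)
for index, stats in summary.items():
    print(index, stats["count"], stats["difficulty"])
```

Windows with no decisions are simply absent from the result. A fuller estimate would also fold in the human factors from Table 1, which this sketch leaves out.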

2.  Task Complexity

In general, an operator’s performance is expected to degrade as the complexity of
the task increases. Task complexity is not the same as air traffic density (in air traffic
control) or air raid density (in missile defense). Through the late 1960s and 1970s,
research suggested that air traffic density and radio communications were the main
contributors to air traffic controllers’ workload (Hurst & Rose, 1978; Mogford, Murphy, &
Guttman, 1994). Although these factors do contribute to workload, Mogford et al.
(1994) showed that factors such as the relative frequency of complex as opposed to direct
aircraft routings and the need for arrival/departure sequencing and spacing may be more
significant.² Other similar studies suggest that factors such as the mixture of aircraft
types, the climbing and descending of aircraft flight paths (Histon & Hansman, 2002;
Mogford, Guttman, Morrow, & Kopardekar, 1995), and the degree to which the structure
of the airspace dynamically changes (Cummings & Tsonis, 2006) also contribute
significantly to task complexity. Hilburn’s (2004) study includes a meta-review and
analysis of 25 other studies that identify factors that contribute to task complexity in
air traffic control.

A battle manager’s performance is
expected to degrade as the complexity
of the task increases. Complexity
factors include the timing, the
quantity and order of events, and the
degree of uncertainty.

² This study involved administering a sequence of questionnaires to over 50 air traffic controllers at the
Federal Aviation Administration’s Jacksonville Air Route Traffic Control Center.

Figure 1 illustrates how the complexity of the situation may vary independently
from traffic density (Hilburn, 2004). Just three aircraft have changed their orientations
from the figure on the left-hand side to the one on the right-hand side. The air traffic
density is unchanged, yet the complexity of the situation has increased dramatically.


Figure  1.  The  complexity  of  the  battlespace  varies  independently 
from  the  air  traffic  density  (Hilburn,  2004) 

The timing and order in which events occur also affect the complexity of the task.
The speed at which events occur in the scenario affects the number of tasks the operator
must attend to over a bounded period of time (Cannon-Bowers & Salas, 1998). Events
containing uncertain information further increase the complexity. In the Adelman,
Bresnick, Christian, Gualtieri, & Minionis (1997) study, 43 Patriot air defense operators were
asked to identify simulated incoming aircraft as friendly or hostile and then engage those
that were determined to be hostile. As aircraft entered the airspace that contained
protected assets, the participants were given sequences of conflicting information regarding
the interpretation of the aircraft in question (e.g., the aircraft responded as a Friendly to
an Interrogation-Friend-Foe inquiry and then jammed the Patriot’s radar). When the
information in these sequences was reordered slightly, the operators’ judgments about the
unknown aircraft changed significantly. This example shows how changing the order in
which information is presented can affect the complexity of the task.

Task complexity can be mediated by the fidelity and the design of the operator
display. The display is composed of static and dynamic environmental constituents, each
of which contributes to the overall complexity. Static elements include geographic
variables such as terrain, land and sea boundaries, situated sensors, and other assets that do
not change. Dynamic elements include moving aircraft, missiles, and weather conditions
(if they change during the time period under consideration). The depiction of the static
and dynamic elements on the interface can also affect the complexity of the task. For
example, the interface may affect an operator’s ability to communicate and issue critical
commands, to locate “hot spots” (locations where critical events often occur), to manage
potential conflicts, and to visualize groups of aircraft or other entities as generalized
representations that can more easily be managed (Histon & Hansman, 2002).

McDermott, Klein, Thordsen, Ransom, & Paley (2000) provide an illustrative
example of how battle manager display features can be modified according to the
complexity of various tasks. They conducted cognitive task analysis interviews with Airborne
Laser (ABL) program managers (PMs), an ABL subject matter expert (SME), and several
crew members who had participated in the Joint Expeditionary Force Exercise 1999
(JEFX-99). From these interviews, the researchers developed a sorted list of tasks,
decisions, and functions involved in ABL missions. The relative time to complete each task
and the relative workload each task imposed on the ABL battle manager were also
estimated. Each task was also assigned a rating (high, medium, or low) to indicate how
cognitively challenging it was. In our terms, this rating represents the inherent task
complexity. For some of the more cognitively complex tasks, McDermott et al. provided
interface modification recommendations that were observed to alleviate the ABL battle
manager’s workload. These recommendations included options such as enabling the
battle manager to toggle track numbers on or off, to put their own designators on tracks, and
to calculate the range between objects. Table 2 includes a sampling of recommendations
relevant to AMD and the battle management function that they were intended to address
in the ABL domain.

Later in this paper, Sections C and D explain how the level of automation, or in
our case, the ABMA, acts as another factor that can mediate the complexity of the task
and possibly reduce the operator’s perceived workload. The ABMA is not likely to affect
operator performance in a linear fashion. The degree to which the ABMA will affect a
battle manager’s performance will depend on other factors, such as his domain
experience and his ability to adapt to the ABMA. For example, the utility of the ABMA will
likely increase as the inherent task complexity increases. As the scenario becomes more
complex, the operator will eventually become overloaded and will need to rely on the
ABMA. However, as the operator begins to rely more on the ABMA, he may perceive
that the complexity and difficulty of the task decreases, perhaps significantly. The
operator may gradually begin to play a monitoring role rather than that of an active
controller. This means that the task complexity, as it is perceived by the operator,
varies with the level of automation set by the ABMA. From an analytical perspective,
this renders the task complexity inappropriate as an independent variable. Section D
addresses this in more detail.

Table 2.  A sampling of human-computer interface recommendations designed to alleviate
ABL battle management workload (from McDermott et al., 2000)

Display Concept Recommendation                    ABL Battle Management Function Addressed

Include designators for tracks                    Monitor enemy tracks

Include range rings for surface-to-air missile    Gauge threat to ABL
sites

Enable operators to differentiate between track   Filter and sort information
types (e.g., surface, air, ground) and toggle
track numbers on or off

Allow operators to put their own designators      Detect problems and inconsistencies in track
on tracks                                         data

Make high-value assets salient on the display     Monitor location of high-value assets

Create an automated Air Tasking Order (ATO)       Reassign tasks/orchestrate priorities
that battle managers can access from their
displays

Record information about missile launches         Determine trends of launch locations
(e.g., launch time and location, track number,
actions taken, results)

Use the information about past missile            Anticipate future launches
launches to predict and prepare for future
launches

Enable the battle manager to calculate the        Recommend changes in orbit or speed
distance between objects

Display messages to inform the operator why       Know weapon status and if weapon is ready
the system cannot execute an instruction (e.g.,   to fire
system cannot fire because of inability to
acquire target)

Show two correlated displays from the same        Deconflict missiles from different locations
perspective

Sound an audio alarm and dim all other tracks     Detect and recognize launch
when a missile is launched so that the new
threat is easily detected

Allow the battle manager to zoom in and out to    Report results
get a better picture
3.  Battle Manager Cognitive and Behavioral Factors

Task complexity is just one factor that contributes to an operator’s workload. The
operator’s performance will also depend on his experience and cognitive ability
(Mogford et al., 1995). As early as 1955, experimental evidence revealed that the amount
of information a person can cognitively process at once does not increase linearly with
the amount of information presented to him (G. Miller, 1955). A well-known limit to
memory capacity is present in all domains (about 7 items), and this limit applies even
across fundamentally different stimuli. This short-term memory capacity limitation
manifests itself in a variety of tasks and materials, but, most typically, it is measured by
memory span tasks. In these tasks, subjects are presented several unrelated items at a
standard rate and asked to recall them in order. Memory span is defined as the maximum
number of items that can be recalled correctly.
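The memory span definition lends itself to a short scoring sketch. The Python below is an illustration only: it assumes each trial is recorded as a pair of sequences (items presented, items recalled) and scores a trial as correct only when every item is recalled in the presented order. The function name memory_span and the trial data are hypothetical; published span tests use various scoring rules that this sketch does not attempt to reproduce.

```python
def memory_span(trials):
    """Return the longest list length recalled perfectly and in order.

    trials: iterable of (presented, recalled) sequence pairs (assumed format).
    A trial counts only if the recalled sequence matches the presented
    sequence exactly, including order.
    """
    span = 0
    for presented, recalled in trials:
        if list(recalled) == list(presented):
            span = max(span, len(presented))
    return span


# Assumed example data: lists of 3 and 5 items recalled correctly,
# a list of 7 items recalled with two items swapped.
trials = [
    (["cat", "pen", "sun"], ["cat", "pen", "sun"]),
    (["map", "cup", "key", "box", "hat"], ["map", "cup", "key", "box", "hat"]),
    (["a", "b", "c", "d", "e", "f", "g"], ["a", "b", "d", "c", "e", "f", "g"]),
]
print(memory_span(trials))
```

For the example data the span is 5, because the seven-item list was not recalled in the correct order.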

As originally conceived, short-term memory capacity is a fundamental human
capability that underlies a variety of cognitive tasks. In reality, performance on memory
span tasks correlates with performance on similar rote memory tasks, but it does not
necessarily relate to performance on more complex tasks that would seem to depend on
short-term memory, such as reading comprehension.

Almost 20 years after the introduction of Miller’s concept of short-term memory
capacity, Baddeley & Hitch (1974) introduced the notion of working memory (WM). The
WM concept viewed short-term memory capacity as the result of a dynamic executive
that controls temporary storage, rehearsal, and attention processes. In a simplistic sense,
the WM concept is more inclusive than short-term memory because it accounts not only
for short-term storage, but also for all the processes that control it.

As the WM concept matured, researchers began to develop ways to measure it.
The breakthrough was the realization that if WM includes the process for allocating
attention, a WM test must require the performer to cope with multiple memory demands.
Thus, WM is measured using the “dual-task” paradigm wherein a person is asked to do
two or more qualitatively different tasks simultaneously. These dual tasks take on a
variety of forms, but the task described by Engle (2002) is representative:

... Subjects read aloud a series of operation-word strings such as ‘Is 4/2 +
3 = 6? (yes or no) DOG.’ They respond as to whether or not the equation
is correct then read the capitalized word aloud. After a set of two to seven
such operation-word strings, we measure the number of words recalled ...
In contrast to memory span performance, WM task performance correlates with a
wide range of higher order cognitive tasks, such as reading and listening comprehension,
the ability to follow directions, note taking, reasoning, bridge playing, and even writing
computer programs. Engle speculates that WM performance, especially WM with a
simple storage capacity statistically controlled, corresponds to the fluid intelligence
construct. Fluid intelligence is the ability to draw inferences and relationships in new
problems, independent of acquired knowledge.

The  WM  concept  complicates  the  answers  to  JTAMDO  questions,  such  as  “how 
many  unassisted  decisions  per  unit  time?”  Engle  (2002)  discussed  how  the  concept  of 
WM  changes  our  notions  of  short-term  memory  capacity,  which 

. . .  often  conjures  up  images  of  a  limited  number  of  items  or  chunks  that 
can  be  stored  (e.g.,  7  ±  2).  However,  my  sense  is  that  WM  capacity  is  not 
about  individual  differences  in  how  many  items  can  be  stored  per  se  but 
about  differences  in  the  ability  to  control  attention  to  maintain  information 
in  an  active,  quickly  retrievable  state  ...  (p.  20). 

On the other hand, compared to the traditional static concept of short-term
memory, the concept of WM seems more relevant to performance on command and control
(C2) tasks, particularly the time-sharing demands of such activities. For example,
Adelman, Miller, and Yeo (2004) showed how an operator’s WM capacity can directly affect
his performance in air defense tasks. Their task involved determining the threat level of
air-breathing targets that enter a set of concentric rings on a radar display and making
engagement decisions for those targets. Participants were given the airspeed, course, and
range information for each target, and, in some experimental cases, they received
additional altitude and radar information. The rate at which targets appeared was also
manipulated to vary the time pressure across
experimental  conditions.  Before  performing  the  task,  the  participants  completed  a  WM 
capacity  test  in  which  they  viewed  numbers  that  flashed  in  sequence  on  a  monitor  and 
determined  whether  each  number  was  the  same  as  the  one  that  flashed  one,  two,  or  three 
numbers  earlier  in  the  sequence.  Adelman  et  al.  (2004)  found  that  performance  on  this 
WM  task  correlated  positively  with  participants’  decision  accuracy  on  the  air  defense 
task.  The  largest  effect  occurred  in  those  situations  in  which  participants  were  asked  to 
consider  the  maximum  quantity  of  information  (including  the  additional  altitude  and 
radar  information)  to  make  a  decision. 
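The WM capacity test described above resembles what is commonly called an n-back task. The following is a minimal sketch of how such a test could be scored; the function names and the accuracy-based scoring rule are assumptions made for illustration and are not taken from Adelman et al. (2004).

```python
from typing import List, Sequence


def n_back_targets(stimuli: Sequence[int], n: int) -> List[bool]:
    """True at position i when stimuli[i] equals the item shown n steps earlier."""
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


def n_back_accuracy(stimuli: Sequence[int], responses: Sequence[bool], n: int) -> float:
    """Fraction of scorable positions (i >= n) where the response matches the truth.

    Assumption: simple proportion correct; the original study may have scored differently.
    """
    truth = n_back_targets(stimuli, n)
    scorable = range(n, len(stimuli))
    if len(scorable) == 0:
        return 0.0
    correct = sum(responses[i] == truth[i] for i in scorable)
    return correct / len(scorable)


# Hypothetical sequence of flashed numbers and one participant's "same?" answers.
sequence = [3, 7, 3, 7, 1, 7]
answers = [False, False, True, False, False, True]
print(n_back_targets(sequence, 2))
print(n_back_accuracy(sequence, answers, 2))
```

Running the same sequence with n set to one, two, or three gives the three difficulty levels mentioned in the text.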


If human memory capacity is limited, as the psychological and air defense
research  suggests,  we  should  expect  the  performance  of  an  AMD  battle  manager  to 
decline  rapidly  when  he  is  overloaded  with  more  than  seven  entities  or  decisions.  For 
example,  if  the  level  of  automation  (i.e.,  ABMA)  is  held  constant  or  turned  off  and  the 
complexity  of  the  scenario  (e.g.,  number  of  threats  per  unit  time)  is  gradually  increased, 
the  operator’s  effectiveness  and  efficiency  should  begin  to  decrease  at  some  point  in  time 
(see  Figure  2). 


Figure  2.  As  the  battle  manager  becomes  overloaded,  his  performance  will  decrease.  The 
point  in  time  and  rate  of  decrease  will  depend  on  his  WM  capacity 

A  person’s  experience  in  a  domain  can  also  change  his  cognitive  and  mental 
capacity  with  respect  to  that  domain.  Chess  is  one  of  the  complex  cognitive  domains  that 
has  been  studied  in  great  depth  to  assess  the  degree  to  which  experience  affects  mental 
capacity.  Simon  (1974)  describes  the  performance  of  novices  and  grandmasters  who  were 
asked  to  reproduce  chess  board  configurations  after  they  were  given  5  to  10  seconds  to 
study them:

If  the  pieces  represent  a  position  from  an  actual  game  (unknown  to  the 
subjects),  then  grandmasters  and  masters  will  generally  reproduce  the 
position (about 20 to 25 pieces) almost without error, while ordinary players
will generally be able to place only a half dozen pieces correctly. If the
same number of pieces is placed on the board in a random pattern, grandmasters
and ordinary players alike will be able to place only a half-dozen
pieces  correctly  (p.  487). 

This effect has been replicated in other, more complex domains such as physics
(Larkin, McDermott, Simon, & Simon, 1980), electronics troubleshooting (Gott &
Lesgold, 2000), and air traffic control (Mogford et al., 1995). It can be explained in part
by chunking: mentally recoding items into aggregates that can more easily be recalled
and cognitively processed. Extensive practice in a domain (Newell & Rosenbloom, 1981)
is likely to result in highly efficient methods for chunking and applying mnemonics and
other domain-specific memory recoding schemes, thus enabling a person to become
highly efficient in recognizing and remembering domain-specific elements, procedures,
or situations. Over the past 30 years, the cognitive science community has learned that
the development of expertise involves much more than improved access to items in
memory. It produces significant changes in performance and process. Experts recognize
recurring patterns and act on compiled combinations of principles and procedures rather
than serially and systematically considering and processing individual pieces of information
(Anderson, 1982; Larkin et al., 1980). They follow cognitive procedures that they
have automated through knowledge and practice and become efficient in restructuring
their own knowledge to select and evaluate alternatives when necessary (Gott & Lesgold,
2000).

In  our  case,  the  degree  of  error  across  a  pool  of  operators  who  have  different 
characteristics  is  likely  to  vary  significantly  based  on  backgrounds  and  experience.  It  will 
depend  on  the  quantity,  complexity,  and  context  of  the  decisions  that  the  operator  makes 
at  each  point  in  time  during  a  given  scenario.  It  can  be  calculated  per  unit  time  to  account 
for  changes  in  the  varying  complexity  of  the  task  and  other  factors  that  change  across  a 
scenario.  Figure  3  shows  how  these  factors  might  be  considered  for  characteristic  groups 
of  operators  (e.g.,  novice,  intermediate,  expert).  The  most  experienced  operators  might 
have  performance  curves  in  the  yellow  area,  indicating  that  they  are  able  to  handle  a 
more  difficult,  complex  scenario  without  as  much  performance  degradation.  The  point 
and  the  rate  of  performance  degradation  (slope  of  the  line)  are  unlikely  to  converge 
across diverse communities of operators. These factors will be influenced by the operator’s
personal characteristics, including his mental capacity, experience, and confidence.

Figure 3. Without automation, operator performance is a function of workload, memory
capacity, experience, and other operator-specific characteristics
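The relationship sketched in Figures 2 and 3 can be illustrated with a simple piecewise-linear model in which each operator group has its own onset point and rate of degradation. This is only a sketch under stated assumptions: the linear form, the `OperatorProfile` type, and every numeric value below are hypothetical and are not drawn from any study reviewed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorProfile:
    name: str
    breakpoint: float  # workload at which performance starts to fall (hypothetical units)
    slope: float       # performance lost per unit of workload beyond the breakpoint


def performance(profile: OperatorProfile, workload: float) -> float:
    """Piecewise-linear performance in [0, 1]: flat up to the breakpoint, then declining."""
    overload = max(0.0, workload - profile.breakpoint)
    return max(0.0, 1.0 - profile.slope * overload)


# Illustrative values only; they are assumptions, not measured results.
PROFILES = [
    OperatorProfile("novice", breakpoint=5.0, slope=0.20),
    OperatorProfile("intermediate", breakpoint=7.0, slope=0.125),
    OperatorProfile("expert", breakpoint=9.0, slope=0.10),
]

for p in PROFILES:
    print(p.name, [round(performance(p, w), 3) for w in (4, 8, 12)])
```

In an experiment, the breakpoint and slope would be the quantities estimated per operator group rather than chosen in advance.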

The term “workload” is context specific and describes the after-action recounting
of the number of decisions the operator made per unit time combined with the
subjectively weighted degree of difficulty of each decision. Different operators will
experience different workloads at different times for the same scenario. Factors such
as the operators’ level of experience, stress, confidence, mental capacity, and other
human factors will influence their workload, which, in turn, will influence their
performance.
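One way to make this definition of workload concrete is to sum the operator's own difficulty ratings over the decisions made in each time interval. The sketch below assumes that "combined" means a weighted sum, which the text does not specify; the `Decision` tuple layout and the sample ratings are likewise hypothetical.

```python
from typing import Iterable, List, Tuple

# Each decision is (timestamp_seconds, difficulty_weight). The weight is the
# operator's own after-action rating, so it is subjective by construction.
Decision = Tuple[float, float]


def workload_per_interval(
    decisions: Iterable[Decision], interval_s: float, n_intervals: int
) -> List[float]:
    """Sum of difficulty weights of the decisions made in each time interval."""
    totals = [0.0] * n_intervals
    for t, weight in decisions:
        idx = int(t // interval_s)
        if 0 <= idx < n_intervals:
            totals[idx] += weight
    return totals


# Same five decisions, rated differently by two hypothetical operators.
novice_log = [(5, 2.0), (20, 4.0), (50, 3.0), (70, 2.0), (110, 3.5)]
expert_log = [(5, 1.0), (20, 3.0), (50, 2.0), (70, 1.0), (110, 2.5)]
print(workload_per_interval(novice_log, 60, 2))
print(workload_per_interval(expert_log, 60, 2))
```

Because the weights are subjective, the two logs yield different workloads for an identical scenario, which is the point the paragraph above makes.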

4. Decision-Making Under Uncertainty in Stressful, High-Risk Situations

A battle manager’s decision-making behavior is affected by the way he handles
uncertainty in stressful, high-risk situations. Because decision-making in an AMD environment
involves a high degree of risk, it is worthwhile to consider how a battle manager’s
decision process might change when he is presented with risky situations
involving uncertain information. Research on human decision-making (Kahneman &
Tversky, 1979) has shown that the way people perceive risk and exhibit risk-seeking or
risk-averse behavior cannot be explained in a computationally logical way by expected
utility theory. Kahneman and Tversky (1979) developed Prospect theory to explain human
decision-making behaviors in the presence of risk. The theory’s underlying concept is
that people base their judgments on perceived increases or decreases in value caused by
gains or losses (with respect to some reference point), with less regard for the final outcome.
The theory also states that as a person accumulates losses without adapting his
reference point, his tendency toward risk-taking behavior increases (which explains the
tendency of some gamblers to increase their betting during a losing streak). This research
may be important to consider if the human battle manager, whose decisions may follow the
patterns that Prospect theory describes, misunderstands the ABMA’s activities because the
ABMA is applying a logical utility-based theory to make decisions on his behalf.

According  to  Prospect  theory,  when  people  are  presented  with  the  possibility  of 
winning,  they  tend  to  select  choices  that  minimize  risk  and  maximize  certainty,  even 
when  the  risk  is  insignificant.  The  one  exception  seems  to  be  situations  in  which  all  the 
choices  present  similar  gains  and  losses  and  “winning  is  possible,  but  not  probable” 
(p.  267).  A  good  example  of  this  situation  is  the  $5  lottery  ticket  for  which  there  is  a  very 
small  chance  of  winning  a  large  sum  of  money.  In  that  case,  people  tend  to  select  the 
choice  that  offers  the  greatest  possible  gains  and  accept  the  small  possible  loss.  On  the 
other hand, when people are presented with the possibility of losing (instead of winning),
they exhibit risk-seeking behavior that attempts to minimize loss, at the risk of losing
even more.

The way that a decision-making problem is presented can significantly affect the
resulting decision. For example, a problem in which the decision-maker is given $1,000
and asked if he would like an additional $500 (the risk-averse decision) or a 50% chance
to win another $1,000 (the risk-seeking decision) can also be presented as one in which
he  is  given  $2,000  and  asked  if  he  would  like  to  give  back  $500  (the  risk-averse  decision) 
or  risk  a  50%  possibility  of  losing  $1,000  (the  risk-seeking  decision).  In  the  first  case,  the 
decision-maker  is  likely  to  make  the  risk-averse  decision.  In  the  second  case,  he  is  likely 
to  take  the  risk-seeking  choice.  When  problems  are  broken  down  into  subproblems  that 
are  presented  sequentially,  each  requiring  an  independent  decision,  the  final  outcomes 
may  also  differ. 
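The framing example above can be worked through numerically with a prospect-theory-style value function. The sketch below is illustrative only: the power-function form and the parameter values (0.88 and 2.25) are commonly cited estimates from Tversky and Kahneman's later work rather than from the 1979 paper cited here, and probability weighting is deliberately omitted. The function names are hypothetical.

```python
def value(x: float, alpha: float = 0.88, beta: float = 0.88, lam: float = 2.25) -> float:
    """Subjective value of a gain or loss x, measured from the reference point.

    Assumed form: concave for gains, convex and steeper (loss aversion) for losses.
    """
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)


def prefers_sure_thing(sure: float, gamble: float, p: float = 0.5) -> bool:
    """Compare a certain change with a p chance of `gamble` (otherwise no change)."""
    return value(sure) > p * value(gamble)


# Frame 1: given $1,000, then +$500 for sure or a 50% chance of +$1,000.
gain_frame = prefers_sure_thing(sure=500, gamble=1000)
# Frame 2: given $2,000, then -$500 for sure or a 50% chance of -$1,000.
loss_frame = prefers_sure_thing(sure=-500, gamble=-1000)

print("gain frame prefers sure thing:", gain_frame)
print("loss frame prefers sure thing:", loss_frame)
```

Both frames have the same expected final wealth ($1,500), yet the modeled preference flips from the sure option to the gamble when the same outcomes are described as losses.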

Kahneman and Tversky’s (1979) Prospect theory may have far-reaching implications
in AMD. Once a battle manager has attained an understanding of the current battle
situation, Prospect theory indicates that changes in the situation are more likely to affect
his decision than his consideration of the decision outcome. If he perceives that the situation
is changing to favor friendly forces, he may choose a risk-averse decision to minimize
risk and maximize certainty; however, if he perceives that the situation is changing
for the worse, he may decide to make riskier choices. Whether this type of innate human
behavior is representative of AMD battle managers’ decision-making processes and can
be mediated through training or an ABMA remains an open question.

5. Cultural Differences

Cultural differences may also affect a battle manager’s decision-making behavior.
In a study funded by the U.S. Air Force Research Laboratory (AFRL) from 2001 to 2004,
Micro Analysis and Design, Inc. (MA&D)^3 assessed the contribution of cultural factors
to operator performance in an Integrated Air Defense System (IADS) (Mui et al., 2004).
The cultural factors that they considered (distribution of power, willingness to take risk,
and familiarity with the enemy) were derived from an analysis of differences among
cultures performed by Hofstede (1984) and subsequently described by Klein, Pongonis,
and Klein’s (2000) Cultural Lens model. Although MA&D was able to model a range of
values for each cultural variable, the ranges for the IADS study were simplified to “high”
or “low.”

^3 MA&D was acquired by Alion Science and Technology in 2006.

Mui  et  al.  (2004)  studied  three  scenarios  and  two  countries  of  interest  (Iraq  and 
North  Korea).  The  first  scenario  was  a  prewar  scenario  in  which  two  enemy  F-16s 
patrolled  a  nearby  border.  The  second  was  a  traditional  wartime  scenario  in  which  the 
blue  forces  were  tasked  to  defend  an  area  from  69  invading  aircraft.  The  third  was  an 
unconventional  scenario  incorporating  surprise  attacks. 

The  MA&D  study  is  similar  to  our  planned  OITL  experiment  in  that  the  simulated 
IADS  operators  were  working  in  the  Sector  Operations  Center,  were  making  critical 
high-level  command  decisions,  and  did  not  have  direct  control  over  the  early  warning 
(EW) radars or weapons. Values for the cultural variables were determined from interviews
with SMEs. North Korea was assigned a high willingness to take risk, and Iraq was
assigned a low willingness to take risk. These cultural factors affected the outcome of the
scenario in relatively predictable ways. For example, a country’s willingness to take risk
translated into more firings on unknown aircraft. In the first scenario, North Korea was
much more likely than Iraq to acquire an unknown aircraft with targeting radar to persuade
the aircraft to retreat. Likewise, assigning a country a low familiarity with the
enemy translated into less effective and less successful offensive strategies. Although this
simulation  was  somewhat  contrived,  it  did  show  how  cultural  factors  can  affect  the  order, 
time,  and  locations  of  firing  assignments.  It  is  not  in  our  current  plan  to  consider  cultural 
variables  in  our  OITL  battle  manager  study.  Instead,  individual  differences  that  account 
for  performance-related  differences  across  all  cultures  will  be  taken  into  consideration 
(see  “individual  differences”  in  Table  1). 

C.  HOW  AUTOMATION  CAN  AUGMENT  OPERATOR  PERFORMANCE 

This section reviews literature related to Question 2:

Under what circumstances will automation improve operator performance
and optimize operator workload?

Automated systems have the potential to increase human performance by carrying
out certain mundane functions, allowing the human to concentrate on more complex cognitive
tasks. For example, the cruise control system on a vehicle alleviates the need for
the driver to regulate his speed and allows him to concentrate on other vehicles’ motion,
on street signs, and so forth. Automated systems can also augment human activity by carrying
out tasks that humans are not physically capable of performing (e.g., weather
satellite imaging) or by performing tasks for which humans show inherent limitations
(e.g., real-time calculation of the distance to a target).

1.  Varying  Levels  of  Automation 

Battle managers’ performance will vary depending on the specific tasks automated
by the ABMA. Not all tasks, however, are good candidates for automation.
Kaempf, Wolf, and Miller (1993) studied the decision-making processes of an air-to-air
warfare team in the Combat Information Center of Aegis cruisers. The researchers found
that the most difficult operator tasks involved assessing the situation and obtaining the
information needed to maintain good situational awareness (as opposed to deciding
whether to engage a threat). By the time the operators were ready to make the decision to
engage, they had already obtained the information they needed. At that point, they just
followed the instructions set out in the standard operating procedures (SOPs). This work
suggests that the ABMA is most likely to improve operator performance if it assists the
operator with the most complex decision-making tasks, including assessing the situation.

Automation  can  enhance  system  and  human  performance  at  four  different  stages 
(see  Figure  4)  (Sheridan  &  Parasuraman,  2006): 

•  The first stage involves the acquisition of information (e.g., from sensors or
fire units via communication networks).

•  The second stage involves the representation and display of the information
on the human-machine interface (HMI). Although automation during this
stage is not necessarily aimed at decision support, it can make a significant
difference in performance. For example, a study by Smith, Johnston, and
Paris (2004) showed that Naval officers in an Aegis Combat Information
Center who viewed information on a set of specialized displays were significantly
less likely to misclassify and target commercial aircraft than the Naval
officers who used a standard Navy training system.

•  The third stage at which automation can enhance performance is the
decision-making stage. Our planned OITL experiment includes provisions for
two different experimental baselines. The first baseline represents the current
system state in which none of the stages are augmented through automation.
For example, under this baseline, the battle manager must request status
information from sensor and fire units. The second baseline includes system
functions that automate the first two stages (information acquisition and
display). Building upon this second baseline, our planned OITL experiment will
be designed to test three distinct types of automated decision-aiding that
augment the third stage (decision-making).

•  The fourth stage is the implementation stage; however, this will not be
addressed in this study because of the focus on battle management, command,
and control.

Figure 4. Four stages at which automation can enhance system and human performance
(adapted  from  Sheridan  &  Parasuraman,  2006) 

The  planned  OITL  experiments  will  focus  on  identifying  the  type  of  automation 
that augments the third stage in Figure 4 (decision-making). The type of automation varies
along two dimensions: the specific decision-making task being automated and the
level at which the system is automating that task. Table 3 shows the eight levels of automation
that have been documented and applied in practice (also see Sheridan, 1992;
Sheridan  &  Parasuraman,  2006).  Each  level  can  be  applied  at  each  of  the  four  stages 
described  previously. 


Table 3. Levels of automation
(from Sheridan & Parasuraman, 2006, p. 94)^4

Level   Response
1       The computer offers no assistance. The humans must do it all.
2       The computer suggests alternative ways to do the task.
3       The computer selects one way to do the task and
4       executes that suggestion if the human approves OR
5       allows the human a certain amount of time to veto before automatic execution OR
6       executes the suggestion automatically and then necessarily informs the human OR
7       executes the suggestion automatically and then informs the human only if asked.
8       The computer selects the method, executes the task, and ignores the human.

^4 This table is a condensed version of the full 10-level taxonomy described in Sheridan (1992; Sheridan
& Verplank, 1978).

Level 1 in Table 3 represents our second baseline for the planned OITL experiments.
Information regarding the status of sensor and weapon systems and inventories is
automatically acquired and displayed on the battle manager’s interface. The human
operator  must  then  perform  the  tasks  of  (1)  assessing  the  situation  and  prioritizing  threats, 
(2)  pairing  weapons  to  targets,  and  (3)  determining  the  acceptability  of  the  engagement 
selections.  For  each  of  these  tasks,  the  ABMA  has  the  option  of  suggesting  alternative 
ways  to  do  the  task  (Level  2),  selecting  one  way  to  do  the  task  but  allowing  the  human 
operator  to  override  this  selection  (Level  5),  or  performing  the  task  (Level  8).  This  results 
in  nine  possible  ABMA  configurations,  in  addition  to  the  two  baselines  (see  Table  4). 


Table 4. Nine possible configurations that consider the
battle manager’s tasks and the three ABMA levels

                        ABMA Levels
Battle Manager Task     ABMA suggests            ABMA selects one way to    ABMA performs
                        alternatives             do the task, allows        the task
                        (Level 2)                manual override (Level 5)  (Level 8)

Assess and prioritize   ABMA suggests possible   ABMA selects threat        ABMA prioritizes
threats                 threat prioritizations   prioritization, allows     threats automatically
                                                 manual prioritization
                                                 changes

Pair weapons to         ABMA suggests all        ABMA selects weapon-       ABMA pairs weapons to
targets                 possible weapon-target   target pairings, allows    targets automatically
                        pairings                 manual pairing changes

Determine the           ABMA suggests            ABMA selects acceptable    ABMA determines
acceptability of the    acceptable engagement    engagement, allows manual  acceptability of
engagement selections   selections               engagement overrides       engagements and
                                                                            performs engagement
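The nine configurations in Table 4 are the cross product of the three battle manager tasks and the three ABMA levels. The sketch below enumerates them; it assumes that each configuration aids a single task at a single level, which matches the nine cells of the table, and the `AbmaLevel` and `TASKS` names are hypothetical.

```python
from enum import IntEnum
from itertools import product


class AbmaLevel(IntEnum):
    """The three ABMA levels under test, numbered as in Table 3."""
    SUGGESTS_ALTERNATIVES = 2
    SELECTS_WITH_OVERRIDE = 5
    PERFORMS_TASK = 8


TASKS = (
    "assess and prioritize threats",
    "pair weapons to targets",
    "determine acceptability of engagement selections",
)

# One (task, level) pair per cell of Table 4, in row-major order.
configurations = list(product(TASKS, AbmaLevel))

for task, level in configurations:
    print(f"Level {int(level)} ({level.name.lower()}): {task}")
```

Adding the two baselines described in the text to these nine pairs gives eleven experimental conditions in total.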

The  experimental  options  listed  in  Table  4  will  allow  us  to  determine  which  levels 
of  ABMA  improve  the  human  information  processing  and  decision-making  capabilities 
for each of the three battle manager tasks. While our planned OITL experiments appear
to be novel to AMD, similar studies have been executed in other related domains, including
ballistic  missile  defense  (BMD),  Tomahawk  strike  planning,  and  air  traffic  control 
(described  in  that  order  in  this  section). 

In 2005, the Schafer Corporation (Schafer Corporation, 15 January 2005) studied
the effect of the first two levels of automation in Table 3. They developed an automated
decision aid in the form of an intelligent agent for the Ground-Based Midcourse Defense
(GMD) Fire Control (GFC) system. This project was funded on a Phase II Small
Business Innovation Research (SBIR) contract sponsored by the GMD program management
office (GFC Products Division). The Northrop Grumman Corporation, which
was in the process of developing the GFC, subcontracted to Schafer to develop and test
the decision aids. During the testing, GFC battle managers were asked to decide whether
and when to override the automated battle management algorithms. Such decisions
included engaging a track, ordering a cease fire, and setting the minimum and maximum
number of intercepts allocated to negate the reentry vehicle (RV) or track cluster. The
decision aid performed four primary tasks, all of which involved highlighting portions of
the display to raise the battle manager’s awareness of certain conditions. The highlighting
was applied to (1) asset values, (2) RV likelihood information in cases with
uncertainties due to sensor tracking, (3) RV likelihood information in cases with
uncertainties due to booster parenting, and (4) clusters of tracks that share the same
impact region or booster parent as another missile with an override. Schafer Corporation
conducted an experiment that tested the performance of 15 battle managers, with and
without the decision aid. The participants included uniformed GFC battle managers and
civilian SMEs. Their performance was measured by calculating their task accuracy and
reaction time. In all the conditions, the battle managers completed the tasks faster with
the decision aid, and, in two of the four conditions, they completed the tasks significantly
faster. In all the conditions, the battle managers’ accuracy was also significantly better
with the decision aid.

Cummings and Bruni (in press) studied the effect of the first three levels of automation
in Table 3 on Naval operators’ decision-making performance in a Tomahawk
Land Attack Missile (TLAM) planning domain. The study involved the development of
automated and partially automated decision aids that assist Tomahawk strike planners in
a multiple resource allocation problem. The planners were asked to assign missiles to
missions by taking into account factors such as the characteristics of each of the planned
missions (e.g., target, route, launch basket), the characteristics of the available missiles
(e.g., type, ship and launch basket required, warhead), each ship’s rate of success for missile
launches, and other constraints such as the number of days to port for each candidate
ship.

The first interface Cummings and Bruni (in press) tested (Interface 1) required the
operators to perform the missile-to-mission matching manually. The interface did filter
the available information, which prevented the operators from matching missiles to
missions in infeasible combinations. The second interface (Interface 2, shown in Figure
5) provided some decision-support tools. These tools included tables showing missiles
that can be matched to missions according to criteria, such as priorities, that the operator
can enter. It also included an “Automatch” button that automatically matched and prioritized
missiles to missions in order of mission importance. This version allowed the
operator to perform “what if” comparisons and save them for future planning. The third
interface (Interface 3) was a higher level display that did not graphically represent specific
missile-to-mission pairings. It required the user to input his constraints, criteria, and
priorities using graphical slider bars. The automated system then attempted to optimize
the resources available to meet the given criteria and produced the best possible
missile-to-mission matches according to an optimization algorithm.

Twenty U.S. Naval officers tested five combinations of the interface designs:
Interfaces 1, 2, and 3 separately, Interfaces 1 and 3 together, and Interfaces 2 and 3
together (Bruni & Cummings, 2007). Operator performance was measured by an objective
weighting function that calculated a weighted sum of the percentages of correct
missile-to-mission matches according to mission priority. Those missions that had higher
priorities contributed more heavily to an operator’s measure of performance. The results
showed that Interface 1, the manual matching interface, and the combination of the two
automated decision-support interfaces (Interfaces 2 and 3 together) generated significantly
better operator performance than the three other conditions. Interface 1 may have
produced good results because the operators explained that they were familiar with similar
types of manual missile-to-mission matching interfaces; however, this explanation
does not indicate why the combination of Interfaces 1 and 3 generated the worst
performance.
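The performance measure described above, a weighted sum of percent-correct matches by mission priority, can be sketched as follows. The priority classes, the weights, and the normalization by total weight are assumptions chosen for illustration; the source does not report the actual weighting function used by Bruni and Cummings (2007).

```python
from typing import Mapping


def weighted_performance(
    pct_correct: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Weighted sum of percent-correct matches per priority class.

    Assumption: the sum is divided by the total weight so the score stays on a 0-100 scale.
    """
    total_weight = sum(weights[k] for k in pct_correct)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(weights[k] * pct_correct[k] for k in pct_correct) / total_weight


# Hypothetical weights: higher-priority missions count more.
WEIGHTS = {"high": 3.0, "medium": 2.0, "low": 1.0}

# Two hypothetical operators with the same unweighted mean (60% correct).
operator_a = {"high": 90.0, "medium": 60.0, "low": 30.0}
operator_b = {"high": 30.0, "medium": 60.0, "low": 90.0}

print(weighted_performance(operator_a, WEIGHTS))
print(weighted_performance(operator_b, WEIGHTS))
```

Operator A scores higher because the correct matches fall on the high-priority missions, which is the behavior the weighting is meant to reward.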

Cummings and Bruni (in press) also found that “the highest level of automation,
Automatch, seemed to improve the mission-missile matching process. According to
users’ feedback, the Automatch function allowed for faster computation of solutions.
However, it was not always used, and, in many cases, participants exhibited significant
distrust in the Automatch, by constantly cross-checking the automation’s solution, which
was expensive in terms of time” (p. 12).


Figure 5. Partial automation support for the Tomahawk missile-to-mission planners
(Cummings & Bruni, in press)

The  third  interface  (which  required  users  to  manipulate  graphical  slider  bars  to 
denote  constraints  and  priorities  and  automatically  performed  the  matching)  induced  user 
frustration  and  mistrust  because  users  “did  not  have  access  to  the  raw  data  and  did  not 
know  exactly  what  assignments  were  made  or  if  a  specific  missile  was  available.  This 
inability to ‘drill down into the detail’ is a known limitation of configural displays; however,
participants were able to adjust their strategies accordingly and performed as well
as  participants  with  other  interfaces”  (Cummings  &  Bruni,  in  press,  p.  13).  Participants 
who  used  Interfaces  1  and  2  explained  that  they  felt  compelled  to  look  at  all  the  available 
drill-down  information,  even  if  it  was  not  significant,  to  ensure  that  they  did  not  miss  any 
critical  information.  This  behavior  led  to  an  increase  in  the  solution  time  (Bruni  & 
Cummings,  2007). 

This study and its findings demonstrate the utility of automation in assisting the
human operator and the importance of ensuring that the operator can access the most
essential data and pairing options, even when the automated system will be recommending
or making pairing decisions.

The GMD study done by the Schafer Corporation (15 January 2005) and the
TLAM study done by Bruni and Cummings (2007) demonstrate the value of automated
decision aids that operate at Levels 2, 3, and 4 in Table 3. These tools suggest alternative
ways  to  do  the  task,  recommend  a  particular  way  to  do  the  task,  and  may  even  execute 
the  recommendation  if  approved  by  the  human  operator.  The  level  of  automation  chosen 
for  implementation  will  affect  the  overall  human-system  performance.  This  issue  is 
addressed  next. 

In the late 1990s, the National Academy of Sciences (NAS) convened a panel on
Human Factors in Air Traffic Control to determine what levels of automation are most
appropriate for which air traffic control tasks (Wickens, Mavor, Parasuraman, & McGee,
1998). One of the panel’s main findings was that decision aids should not go beyond
suggesting preferred alternatives in situations that involve a considerable degree of
uncertainty and risk. The reasons for this caution, which include loss of situational
awareness, complacency, and skill degradation, are described later in Section D. Decisions
about which tasks to automate and to what degree should consider the reliability,^5 not
the availability, of the automation (Hawley & Mares, 2006). The panel also recommended
that the choice of automation level should be based on an understanding of
human behavioral strengths, tendencies and vulnerabilities, and the consequences of
making mistakes.

All the NAS panel findings are directly applicable to AMD and will be integrated
into our study. The first stage of the planned OITL study (which will address Question 1)
is designed to foster an understanding of the cognitive and mental capacity of AMD
operators who have varying levels of experience. Then, the performance of each
group of operators (e.g., novice, intermediate, expert) will be tested for each battle
manager task and ABMA level. Both the reliability of the ABMA and the resulting
improvement in performance will be considered in assessing the risks and benefits of
automating each task.


The choice of the automation level
should be based on an understanding
of human behavioral strengths,
tendencies and vulnerabilities, and the
consequences of making decisions.


Decisions  about  which  tasks  to 
automate  and  to  what  degree  should 
consider  the  reliability — not  the 
availability — of  the  automation. 


^ Reliability refers to the decision-making competence of the automation, not to its operational
consistency.

2. Augmenting Air and Missile Defense Crew Performance Through Automation

The benefits of automated systems go beyond enhancing individual human performance
to changing environmental and crew configuration requirements. Although our
planned OITL experiments will consider the performance of an individual battle manager
under varying levels of automation, battle managers seldom work alone. They operate as
part of a crew and have roles such as weapons assignment officers, air defense managers,
communications officers, and battle manager chiefs. This section addresses considerations
in constructing and sustaining crews of operators and automated systems with
complementary responsibilities. Many of the findings from these crew performance analyses
substantiate what we already know about individual battle manager performance. They
also illuminate the key concepts that will need to be kept in mind when the
individual OITL experiments are expanded to crew-based battle management
experiments.

Increasing the number of crew members causes an exponential increase in the
number of ways that variables such as crew members’ behavioral factors, cognitive abilities,
and experience can be combined with the candidate system functions and their
corresponding levels of automation. In these situations, identifying the most appropriate
combinations of human and system functions to operate in the environment becomes
difficult. An unreasonable number of traditional human-in-the-loop experiments would need
to be designed, executed, analyzed, and synthesized to account for all possible
experimental conditions. Computer-based Human Behavior Representations (HBRs) in
constructive simulations present an alternative because they do not require human operators.
HBRs have been used successfully to simulate human behaviors, cognition, and performance
in complex military environments (Morrison, 2003). They can become complex; for
example, some include models of short-term memory, long-term memory, and emotional
behavior. HBRs have been used in virtual (i.e., combinations of humans-in-the-loop and
computer-based agents) simulations to emulate enemy forces or to supplement friendly
forces. However, the distinct advantage of HBRs is in their application to constructive
simulations in which human operators are not needed to execute a large number of
scenarios, to model operators’ reaction and behavior under thousands of combinations of
conditions, to assess the resulting performance, and to select an optimal set of variables
and constraints. The missile defense studies described next are based on constructive
simulations that employ such HBR models.
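
To give a rough sense of how quickly the number of experimental conditions grows, the
sketch below counts conditions under a simple full-factorial assumption that is ours and
not taken from the cited studies: every crew member has one of three experience levels
(novice, intermediate, expert), and each system function is set to one of eight automation
levels. The function name count_conditions and the choice of four system functions are
hypothetical and serve only as an illustration.

```python
def count_conditions(crew_size, experience_levels, functions, automation_levels):
    """Count distinct conditions under a full-factorial assumption (illustrative only)."""
    # Each crew member independently takes one experience level.
    crew_profiles = experience_levels ** crew_size
    # Each system function independently takes one automation level.
    automation_settings = automation_levels ** functions
    return crew_profiles * automation_settings


if __name__ == "__main__":
    # Hypothetical values: 3 experience levels, 4 functions, 8 automation levels.
    for crew_size in (1, 3, 5):
        print(crew_size, count_conditions(crew_size, 3, 4, 8))
```

Even with these small, assumed values, moving from one operator to a five-person crew
multiplies the number of conditions by 81, which illustrates why constructive simulation
is attractive for crew-level studies.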

Over the past 10 years, the air defense and missile defense communities have
exhibited a keen interest in studying the effect of automation on crew performance. In
1997, the National Missile Defense (NMD) Joint Program Office (JPO), the GMD project
office, the Army Research Laboratory (ARL), the U.S. Army Space Command
(ARSPACE), the U.S. Air Force Space Command (AFSPC), Boeing, and TRW initiated
a major missile defense operator performance modeling effort. The effort began with the
development of a 181-page Operator Task List (OTL). Two separate efforts then
attempted to simulate the tasks on this list in the context of realistic scenarios and to
validate the simulation. The ARL Human Research and Engineering Directorate (HRED)
funded an MA&D effort (1997-2003) to assess battle manager workload. This project
aimed to identify the optimal number of battle managers needed to manage a typical
BMD battle involving about five ballistic missile threats. In 2001, Boeing tasked TRW to
run a similar analysis, and, later, the two independent analyses were compared.

In the 2001 TRW study (September 2001), only the highest level tasks from the
OTL (e.g., making a cease engagement or weapons-free decision) were considered.
Crews were modeled as teams of operators who were assigned generic roles and worked
together to complete the tasks. The model included the time the operators needed to
complete each task. These data were obtained in part from ARL and two different Battle
Planning Exercises (BPEXs): BPEX 99-1 and BPEX 99-3. Operator and crew performance
was measured by calculating their task completion times. Operator stress was modeled
by reducing the amount of time required to complete tasks by a fixed percentage
(e.g., 20% in one case, 50% in another). The requirement to obtain command approval
for decisions increased the time required to complete decision-making tasks by another
fixed amount (e.g., 75 seconds). This study considered crews of three, four, and five
operators and determined that GMD crews perform best when tasks are distributed
among five battle managers. Factors such as the effort needed to manage crew
communication as the crew size increased were not taken into account. A more recent study by
Aptima, Inc. (Paley, Levchuk, Clark, Miescher, & Baker, 2004) showed that even when
crew communication overhead is taken into account, increasing the size of the crew
decreases the workload of each crew member. If this finding is true, it suggests that a
larger crew size of six, seven, or eight operators might produce even better performance.
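
The two timing adjustments described for the TRW model can be sketched as follows. This
is an illustrative reading of the description above, not the TRW implementation: the
function name adjusted_task_time, the 120-second baseline used in the example, and the
order in which the adjustments are applied are assumptions. Only the 20% and 50% stress
percentages and the 75-second approval delay come from the text.

```python
def adjusted_task_time(base_seconds, stress_reduction=0.0,
                       needs_approval=False, approval_delay=75.0):
    """Return a task completion time in seconds after both adjustments (a sketch)."""
    if not 0.0 <= stress_reduction < 1.0:
        raise ValueError("stress_reduction must be in [0, 1)")
    # Stress is modeled, as described in the text, as a fixed percentage
    # reduction in the time required to complete the task.
    seconds = base_seconds * (1.0 - stress_reduction)
    # Command approval adds a fixed delay to decision-making tasks.
    if needs_approval:
        seconds += approval_delay
    return seconds


if __name__ == "__main__":
    # Hypothetical 120-second decision task that requires command approval.
    for stress in (0.0, 0.2, 0.5):
        print(stress, adjusted_task_time(120.0, stress, needs_approval=True))
```

Summing such adjusted times over the tasks assigned to each operator would give one
simple way to compare crews of three, four, and five operators, although the source does
not state how the TRW model aggregated task times.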

In the longer term MA&D study (Walters & Eabay, 2003a, 2003b; Walters &
Pray, 2003), operators were assigned tasks according to their roles (e.g., battalion
director, battle analyst, sensors operator, weapons operator, communications operator). Battle
management crew performance was calculated in terms of the number of total tasks the
operators could perform, the time to complete the tasks, the number of tasks that were
interrupted (and hence dropped), the number that were consequently restarted, and the
time that operators spent monitoring the situation. This study was an intense effort to
model in detail about 315 of the tasks in the extensive OTL and their low-level task
contingencies. MA&D studied the performance of crews made up of 4, 5, and 6 operators
during 2002-2003, and the results confirmed the findings of the earlier 2001
TRW study. The MA&D study showed that a five-person crew of battle managers
completed a greater number of tasks overall in a shorter time period, dropped fewer tasks
because of interruption, restarted a greater number of tasks that were dropped, and spent
more time monitoring the situation and gaining situational awareness. One of the main
lessons learned was the difficulty in obtaining accurate task times for the tasks in the
OTL, in particular those inherently cognitive decision-making tasks.

More recently, the U.S. Air Force Electronic System Command funded an effort
to determine the optimal operator task loading and crew configuration (e.g., who should
do what, when, where) to conduct a Battle Management, Command and Control (BMC2)
mission using the E-10 Multi-Sensor Command and Control Aircraft (MC2A). The
E-10A MC2A aircraft supports battle management, intelligence, surveillance,
reconnaissance, and selected information warfare functions (Levchuk, Chopra, Paley, Levchuk, &
Clark, 2005; Moore, 2004). This study is particularly relevant because of its application
to battle management for air and cruise missile defense.

For this effort, Aptima, Inc. (Levchuk et al., 2005; Paley et al., 2004), under
contract to the Massachusetts Institute of Technology (MIT) Lincoln Laboratory, developed
the Team Optimal Design (TOD) model. First, they created a model that described
33 functions (e.g., process indications and warnings, provide threat updates, determine
weapon-to-target pairing) that battle managers carry out while operating the MC2A.
These functions were derived from SME working groups,® system documentation, and
mission scenarios from a Virtual Flag training exercise (Paley et al., 2004). Each function
was decomposed into a task flow diagram that described the sequence of tasks required to
fulfill the function’s goals. For example, the function “assess active threats” involved
sequences of tasks such as “perform risk assessment for friendly assets” or “identify radar
track 28.” The 147 tasks in the task flow diagrams were also assigned attributes such as
duration, workload, and information requirements. A series of 54 mission events were
then created (e.g., TBM Launch, Red EW Radars Active, Red Strike package ingress).
These mission events entailed the execution of the already defined functions.^ One
representative mission required a 25-person MC2A crew to complete 12,246 tasks during a
6-hour period.


® The working groups met on six occasions at Langley Air Force Base (AFB). About 12 U.S. Air Force
active duty and civilian SMEs attended each meeting, and a core complement of about 5 SMEs
attended all 6 meetings (personal communication with M. Paley, Aptima, Inc.).


Decision-making tasks that are
inherently cognitive contain a large
amount of variability across battle
managers.

Tasks with similar characteristics were grouped into representative task classes.
The TOD model considered these classes of tasks, the resources available, and the
characteristics of the battle managers (e.g., competence, experience, memory, and learning) to
compute the most efficient combination of battle manager roles and responsibilities for a
given scenario. Although the computation of optimal crew configurations may not be
relevant to our planned OITL experiment, the task, workload, and accuracy models that
Aptima, Inc. developed to reach this endpoint can similarly be applied to study the
temporal dynamics of battle manager performance and workload.

The TOD model was designed to compute the workload for a battle manager at
time t as a function of the classes of all the tasks that the battle manager is performing
and the residual workload from previous tasks (which fades over time). Accuracy
for a battle manager at time t was calculated as a function of the battle manager’s
competence (which is determined by learning rate, memory, and training experience)
and workload at that time. For situations in which workload is low, one can choose to
model accuracy as high (when the battle manager performs the task automatically) or
low (when the battle manager is bored).
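
The source does not give the functional forms used in the TOD model, so the sketch below
is only one plausible reading of the description above. The exponential decay of residual
workload, the capacity threshold, the inverse-load accuracy penalty, and all names and
parameter values (Task, workload, accuracy, decay_time, capacity, low_load,
low_load_factor) are assumptions introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Task:
    start: float  # time the task begins (seconds)
    end: float    # time the task ends (seconds)
    load: float   # workload contributed by the task's class (hypothetical units)


def workload(tasks, t, decay_time=60.0):
    """Workload at time t: loads of active tasks plus fading residue of finished ones."""
    total = 0.0
    for task in tasks:
        if task.start <= t < task.end:
            total += task.load  # task is currently being performed
        elif t >= task.end:
            # Assumed exponential fade of residual workload after the task ends.
            total += task.load * math.exp(-(t - task.end) / decay_time)
    return total


def accuracy(competence, load, capacity=1.0, low_load=0.0, low_load_factor=1.0):
    """Accuracy as a function of competence and current workload (assumed form)."""
    if load < low_load:
        # Modeling choice noted in the text: factor 1.0 for automatic
        # performance, a factor below 1.0 for a bored battle manager.
        return competence * low_load_factor
    if load <= capacity:
        return competence
    return competence * capacity / load  # overload reduces accuracy


if __name__ == "__main__":
    tasks = [Task(0.0, 10.0, 0.5), Task(5.0, 20.0, 0.4)]
    for t in (2.0, 7.0, 20.0, 80.0):
        w = workload(tasks, t)
        print(t, round(w, 3), round(accuracy(0.9, w), 3))
```

Setting low_load_factor below 1.0 represents the bored battle manager, while leaving it at
1.0 represents automatic task performance; which of the two applies is a modeling
decision, as the text notes.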

The Aptima E-10A researchers explained that an optimal crew configuration
might not exist. Distributing all the necessary tasks to some number of crew members so
that none of the battle managers is overloaded might not be possible. In that case, battle
managers may end up with overlapping responsibilities, which would increase the need
to communicate and coordinate. The overhead of this communication and coordination
then factors back into the calculation of overall workload. One of the most important
lessons learned involved the degree of system automation. Changes in the degree and
type of automation would change the crew composition requirements. In turn, the results
of the simulation showed that changing the composition of the crew and the crew
members’ associated roles had the greatest effect on the overall performance (Paley et al.,
2004).


Changes in the degree and type of
automation change the crew
composition requirements, which, in turn,
change the crew members’ roles and
overall performance.


^ The assignment of specific functions to scenario events was determined by both active duty and
civilian SMEs from a number of military and DoD organizations.

D.  HOW  AUTOMATION  CAN  HINDER  OPERATOR  PERFORMANCE 

This section reviews literature related to Question 3:

Under  what  circumstances  might  automation  decrease  operator  perform¬ 
ance  and  situational  awareness  while  still  optimizing  operator  workload? 

In this section, we review research that suggests that an ABMA may, in some
cases, decrease operator performance. Although the intent of the ABMA is to decrease
operator workload, it may increase the overall level of cognitive effort required by the
operator. In addition to requiring the operator to continue to assess the situation and
formulate his own decisions, the automated system would require the operator to evaluate
the system’s recommendations and compare them to his decisions (Hilburn, 2004; Miller
& Parasuraman, 2007). These requirements may also lead to additional job preparation
and training in “managing the automated battle manager” (Hawley, Mares, &
Giammanco, 2006).

Automated decision aids can also produce automation bias, a condition in which
operators learn to rely on the cues presented by the automated system as a replacement
for their own cognitive effort, human information seeking, and processing (Mosier,
Skitka, Heers, & Burdick, 1998). While reliance on these automated decision aids can
improve performance by freeing the operator from attending to mundane tasks and
enabling him to concentrate on complex cognitive tasks, overreliance results in accidents,
especially when the automated system fails (Sheridan & Parasuraman, 2006).

Automation  Bias,  Complacency,  and  Supervisory  Control  Effects 

From the late 1970s through the 1980s, industrial engineers, human factors
researchers, and psychologists were concerned about the way that human information
processing errors were being blamed for several devastating system failures (e.g., the
meltdown at Three Mile Island in 1979, the Korean Airlines plane shot down by Soviet
fighters in 1983, the USAir B-737 crash in 1989). This concern led to an abundance of
research that showed how operator awareness could be reduced to unsafe levels when the
human is removed from the control loop and an automated computer controller is
responsible for operating the system (Kaber & Endsley, 1997). For example, Wickens (1992)
showed that operators respond more slowly to systems that are running in an automated
mode. Automation can hamper the development and maintenance of skills required
during normal manual operations and increase the time required to train these
skills. The time available for training must be distributed across courses for training
fundamental skills and courses for training operators how to manage the automated
system (Hawley et al., 2006).

Critical  operational  errors  can  also  result  from  a  misallocation  of  appropriate 
functions  between  the  automated  system  and  the  human  operators  (Wickens,  1992).  In 
situations  where  finding  enough  skilled  operators  is  difficult  and  assigning  tasks  to  an 
automated  system  may  be  more  cost  effective,  operators  may  be  reduced  to  supervisory 
roles.  This  situation  places  them  out  of  the  control  loop  and  makes  them  susceptible  to 
attention-degradation  effects. 

Examples of automation bias effects in AMD operations are not uncommon.
During Operation Iraqi Freedom (OIF), operators’ overreliance on a fallible automated
system led to two separate fratricide incidents and the loss of three flight crew members
(Hawley, 2007). The automated system was the Army’s Patriot missile defense system,
which had experienced misclassification errors during operational tests before the
incidents. The operators’ performance was driven by battle management training on rote
drills, tactics, techniques, and procedures. Decision-making for cases of track
misidentification or misclassification was not comprehensively covered during this training. In the
first incident, a British Tornado was misclassified as an antiradiation missile. In the
second incident, a Navy F/A-18 was misclassified as a tactical ballistic missile (TBM). Both
targets were engaged and destroyed.

An abundance of research directed by the Federal Aviation Administration (FAA)
through the 1980s and 1990s studied the phenomena of automation bias, complacency,
and other similar issues in air traffic control. For example, Endsley and Rodgers (1996)
studied the way in which air traffic controllers distributed their attention among aircraft
while observing 15 different scenarios that contained operational errors. They used the
Situation Assessment Through Re-creation of Incidents (SATORI) system to simulate the
data from actual recorded air traffic control situations and synchronized the simulation
with audio tapes of the controller-pilot communication. Occasionally, the researchers
froze the simulation, blanked out the screen, and asked the controllers a number of
questions. The study showed that the controllers reported only about 67% of the aircraft
present on the display and did not generally retain detailed aircraft information (e.g., call
signs, groundspeed, and direction). This low level of situational awareness may be
explained or intensified by supervisory control effects (described next).


Critical operational errors can result
from a misallocation of functions
between the automated system and
the human operators.

In Endsley and Rodgers’ study, the air traffic controllers did not interact with the
simulation. Instead, they passively monitored the system. The activities involved in
passively monitoring an air traffic control simulation may be similar to the monitoring
activities involved in a highly automated environment with an ABMA that provides
the maximum level of automation. In this passive mode, the operator may not achieve
the same level of attention and situational awareness as when he actively monitors
and controls the system. Thus, while automation may reduce operator workload, it may
also have the side effect of decreasing operator activity, engagement, and attention. When
this happens, operator situational awareness and performance may also decrease. If this is
true, it may explain the relatively low level of situational awareness observed in Endsley
and Rodgers’ study.

This effect can be exacerbated in the presence of novice operators who do not
have the tactical and technical knowledge needed to understand the system decision
processes. Research by the U.S. Navy suggests that a battle manager’s level of experience
and his tactical and technical expertise are directly related to his ability to maintain
the situational awareness needed to supervise a fully automated system effectively
(Hawley & Mares, 2006). For our planned OITL experiments, this research suggests that
a novice operator’s performance may degrade faster than an expert operator’s performance
under the fully automated ABMA condition (Level 8 in Table 3).

Kaber and Endsley (1997) conducted a study that examined the specific combinations
of human operator and automated system coordination that increase (or decrease)
overall system performance. It drew upon a taxonomy that described 10 graded levels of
automation from strict manual control to fully automated (see Table 3 for a condensed
version).


While automation may reduce operator
workload, it may also decrease
operator activity, engagement, and
attention, which could lead to a
decrease in situational awareness and
performance.

During the experiment, subjects were asked to eliminate simulated targets that
were moving toward the center of the screen. In some cases, the human operator
performed the functions of monitoring the system status, generating strategies for
eliminating targets, selecting a particular strategy, and implementing this strategy. In other
cases, the system performed various combinations of these functions. Kaber and Endsley
found that overall performance degraded under the strictly manual control condition and
under all other conditions in which some of the tasks were automated but the human was
ultimately tasked with implementing the plan. When the level of automation varied
across time, subjects had difficulty recovering from situations in which automation
included advanced queuing of targets. They became accustomed to focusing on future
tasks and tended to neglect present state incidents.

This experiment examined combinations of human operator and automated system
functions and confirmed the importance of establishing the appropriate allocation
and coordination between these functions. Neither this experiment nor the earlier one
(Endsley & Rodgers, 1996) studied the dynamics of how air traffic controllers’ attention
changes as the number of aircraft in the simulation increases or decreases. We expect this
to be a focal point of the OITL AMD experiment.

Cummings and Mitchell (2006) studied the workload and performance of
12 operators as they supervised 4 simulated unmanned aerial vehicles (UAVs) that were
tasked to destroy a set of time-sensitive targets. (Nine of the 12 participants were active
duty United States Air Force (USAF) officers or Reserve Officer Training Corps (ROTC)
students.) The operators were responsible for tasks such as assigning or unassigning
targets to UAV mission plans, arming and firing payloads, and ordering UAVs to return to
base. Three conditions representing different levels of automated decision support were
tested:

•  The first level involved a manual decision-aid display containing a series of
visual timelines showing the scheduling of ATO events associated with each
UAV (see Figure 6).

•  The  second  level  included  an  automated  decision  aid  in  the  form  of  the  visual 
timelines  alongside  a  series  of  computer-based  recommendations  that  the 
operator  could  accept  or  reject. 

•  The  third  level  was  a  fully  automated  management-by-exception  system  that 
executed  arming  and  firing  actions  according  to  the  rules  of  engagement. 

Figure 6. Visual timeline decision aid for managing and scheduling the UAV ATO
(from Cummings & Mitchell, 2006)

Each condition progressively automated more functions involved in the management
and scheduling of the ATO. Operators using the fully automated system had an
opportunity to intervene and veto each action 30 seconds before it occurred. The
scenarios were scripted according to two levels of difficulty (high replanning and low
replanning) depending on the frequency of replanning that was required in the scenario to
address emergent threats, new tasking from superiors, and system failures.

The results of Cummings and Mitchell’s study showed that in the high replanning
condition, the operators who used the automated timeline decision aid had lower
performance scores and higher subjective workload scores than those who used the manual
timeline and those who used the fully automated management-by-exception system. In
fact, the automated decision aid produced the poorest scores overall and the lowest
situational awareness (measured by the operators’ subjective assessment of their
comprehension of the current situation). These results suggest that even an arbitrary and
conservative level of automation does not necessarily improve performance under
high-workload conditions. This study also illustrates the complexity of the interaction among
the human and automation-related variables that affect the resulting workload and
performance of the human-computer partnership.

One such complex interaction involves the human operator’s perception and trust
of the system’s recommendations. Skitka, Mosier, and Burdick (1999) point out the
extensive research in social psychology showing that a person is likely to harm others if
directed to by an authority figure. To the extent that a person perceives an automated
decision aid as having authority, there is reason to believe that the person might similarly
follow the system’s recommendations without further consideration. Skitka et al. (1999)
tested this hypothesis by asking 80 participants to perform a set of tasks designed to
simulate the monitoring and tracking of commercial airlines. In the experimental
condition, the participants had access to an automated monitoring aid that prompted them
about various system events. In the control condition, the participants used the
same system but did not have access to the decision aid. The participants were
specifically told that the automated decision aid was not always accurate and that the
other gauges and instruments (available in both conditions) were always 100% accurate.
The results of this experiment confirmed Skitka et al.’s hypothesis. Not only did the
participants in the experimental condition follow the advice of the decision aid when the
other instruments provided contradictory evidence, but they were also less vigilant than
participants in the control condition, missing a significantly larger number of events that
occurred without a system prompt.

E.  SUMMARY 

This report summarizes the findings from an array of literature related to how an
AMD battle manager’s performance degrades with increased workload in the context of
various realistic scenarios. We also discussed how an ABMA can moderate this
degradation. The reviewed literature was organized according to three research questions:

1.  Without  automation  assistance,  how  many  decisions  can  an  operator  handle 
per  unit  time?  At  what  point  does  operator  performance  drop  off,  and  does  it 
drop  off  gradually  or  abruptly? 

2.  Under  what  circumstances  will  automation  improve  operator  performance 
and  optimize  operator  workload? 

3.  Under  what  circumstances  might  automation  decrease  operator  performance 
and  situational  awareness  while  still  optimizing  operator  workload? 

For the first question, without the assistance of automation, a battle manager’s
performance will degrade as the complexity of the task increases. If human memory
capacity is limited in the way the psychological and air defense research suggests, we
should expect the performance of an AMD battle manager to decline rapidly when he
becomes overloaded with more than seven entities or decisions. The complexity of the
task can be mediated by several factors, including the fidelity and design of the operator
display and the level of automation of the system. An operator can improve his
performance by increasing his cognitive capacity, restructuring his knowledge, and gaining
experience in the domain. Other factors, such as an operator’s risk-taking and cultural
behaviors, can also affect his performance.


To the extent that a person perceives an
automated decision aid as having
authority, he will follow the system’s
recommendations even in the face of
contradictory evidence.

For the second question, one of the prominent factors that affects an operator’s
performance is the level of automation. Section C.1 outlined four different stages and
eight different levels at which automation can enhance system and human performance.
One of the studies reviewed (Bruni & Cummings, 2007; Cummings & Bruni, in press)
demonstrated the utility of automation in assisting the human battle manager and the
importance of ensuring that he has access to the data needed for decision-making, even
when the automated system will be recommending or making pairing decisions. At the
same time, this study indicated that providing an extensive amount of drill-down
information in a time-sensitive situation would compel the battle manager to review all the
data, increasing the problem-solving time. Section C.2 explained how changes in the
degree and type of automation introduced into the system would change the crew
composition requirements. In turn, changing the composition of the crew and the crew
members’ associated roles affected their overall performance.

For the third question, an abundance of research indicates that while automation
may decrease operator workload, it may also, paradoxically, increase the overall level of
cognitive effort required by the operator. In addition to requiring operators to continue to
assess the situation and formulate their own decisions, automated systems require
operators to evaluate the system’s recommendations and compare these recommendations with
their own. This additional cognitive effort in “managing the automated battle manager” is
also prone to the consequences of automation bias, a situation in which operators learn to
trust and rely on the cues that are presented by the automated system as a replacement for
their own cognitive effort, information seeking, and processing. There is no shortage of
research showing how overreliance on automation can result in fatal accidents when the
automated system fails.

Some  of  the  studies  described  in  this  report  have  direct  implications  for  the  design 
and  analyses  of  our  OITL  experiments  and  the  design  of  the  battle  manager  simulation 
interface.  Because  different  battle  managers  are  likely  to  experience  different  workloads 
at  different  times  for  the  same  scenario,  the  OITL  experiment  participants  should  include 
battle  managers  who  have  a  range  of  abilities  (e.g.,  novice,  intermediate,  expert)  and 
backgrounds.  The  simulation  should  alert  the  battle  manager  about  critical  situations,  and 
the  design  of  the  interface  should  not  impede  the  battle  manager’s  ability  to  access  the 
decision-making  data.  To  the  extent  possible,  the  simulation  environment  should 
encourage a trusting human-system partnership, but it should not induce a false
perception  of  trust  if  the  decision-making  data  are  not  reliable.  The  reliability  of  the 
decision-making  data  should  be  apparent  to  the  battle  manager  and  should  play  a  major 
role  in  assessing  the  most  appropriate  level  of  automation.  A  battle  manager’s 
performance  is  expected  to  degrade  as  the  complexity  (e.g.,  the  timing,  quantity  and  order 
of  events,  and  degree  of  uncertainty)  of  the  task  increases.  As  the  battle  manager’s 
performance  degrades,  the  most  appropriate  level  of  automation  should  be  based  on  an 
understanding  of  human  behavioral  strengths,  tendencies,  and  vulnerabilities  and  on  the 
consequences of making mistakes. If automation decreases operator activity, engagement,
and attention and leads to a decrease in situational awareness and performance,
battle  manager  performance  should  decrease  at  the  highest  ABMA  level.  The  OITL 
experiments  should  be  designed  to  test  these  hypotheses,  and  the  selected  analysis 
methods  should  address  these  considerations. 